It is not believed that extensions of time or fees for net addition of claims are required,
beyond those which may otherwise be provided for in documents accompanying this paper. 

AMENDMENTS

In the Claims 

The following is a marked-up version of the claims with the language that is underlined 
(" ") being added and the language that contains strikethrough (" — ") being deleted: 

1. (Currently Amended) A method comprising: 

receiving a first email message from a simple mail transfer protocol (SMTP) server, the 
first email message comprising displaying characters and non-displaying characters, the non- 
displaying characters including non-displaying comments and non-displaying control characters; 
the first email message further comprising: 

a 32-bit string indicative of a length of the first email message; 

a text body; 

an SMTP a simple mail transfer protocol email address that includes a user 
name and a domain name; 
an attachment; 

searching for the non-displaying characters in the first email message; 

removing the non-displaying characters, including the non-displaying comments and the 
non-displaying control characters; 

determining non-alphabetic displaying characters in the first email message, where 
determining the non-alphabetic displaying characters includes a per-character analysis that 
recursively determines for each character whether: 

a character is a non-alphabetic character; 

if the character is a non-alphabetic character, whether the character is a space; 
if the character is a space, determine whether the space is adjacent to a solitary
"i" or "a"; and

in response to a determination that the space is not adjacent to a solitary "i" or
"a", deleting the non-alphabetic character; and

if the non-alphabetic character is not a space, filtering the determined non- 
alphabetic displaying characters from the first email message; 

generating a phonetic equivalent for each word that includes only alphabetic displaying 
characters that has a phonetic equivalent; 

tokenizing the phonetic equivalents in a displaying portion of the text body to generate a 
plurality of body tokens representative of words in the text body; 

tokenizing the SMTP simple mail transfer protocol email address to generate an address 
token representative of the SMTP simple mail transfer protocol email address; 

tokenizing the domain name to generate a domain token that is representative of the
domain name;

tokenizing the attachment to generate an attachment token that is representative of the 
attachment, wherein tokenizing comprises: 

generating a 128-bit MD5 hash of the attachment; 

appending the 32-bit string to the generated MD5 hash to produce a 160-bit 
number; and 

UUencoding the 160-bit number to generate the attachment token representative 
of the attachment; 

determining a corresponding spam probability value for each of the plurality of body 
tokens, the address token, the domain token, and the attachment token; 

determining whether at least one of the plurality of body tokens, the address token, the 
domain token, and the attachment token is present in a database of tokens and, in response to 
a determination that at least one of the plurality of body tokens, the address token, the domain 
token, and the attachment token is present in the database of tokens: 

updating the spam probability value of the plurality of body tokens, the address
token, the domain token, and the attachment token; and 

sorting the plurality of body tokens, the address token, the domain token, and the 
attachment token in accordance with the corresponding spam probability value to 
determine a predefined number of interesting tokens, the predefined number of 
interesting tokens being a subset of the plurality of body tokens, the address token, the 
domain token, and the attachment token; 

classifying the plurality of body tokens, the address token, the domain token, and 
the attachment token as spam, non-spam, or neutral; 

selecting the predefined number of interesting tokens, to create selected 
interesting tokens, the selected interesting tokens being the plurality of body tokens, the 
address token, the domain token, and the attachment token having a greatest non- 
neutral probability values; 

performing a Bayesian analysis on the selected interesting tokens to generate a 
spam probability; 

categorizing the first email message as a function of the spam probability; and 
filtering a second email message. 
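
By way of illustration only, and without limiting the claim, the attachment-tokenizing steps recited above (MD5 hash, appended 32-bit string, UUencoding) might be sketched as follows. The sketch assumes the 32-bit string is an unsigned big-endian integer holding the message length in bytes and that it is appended after the hash; the claim fixes neither the byte order nor the unit of length. The function name `attachment_token` is hypothetical.

```python
import binascii
import hashlib
import struct


def attachment_token(attachment: bytes, message_length: int) -> str:
    """Return a uuencoded 160-bit token for an attachment (illustrative only)."""
    digest = hashlib.md5(attachment).digest()         # 128-bit MD5 hash (16 bytes)
    length_field = struct.pack(">I", message_length)  # assumed: 32-bit big-endian length
    combined = digest + length_field                  # 128 + 32 = 160 bits (20 bytes)
    # b2a_uu encodes at most 45 bytes per call, so 20 bytes fit in one uuencoded line.
    return binascii.b2a_uu(combined).decode("ascii").rstrip("\n")
```

Because the token depends only on the attachment bytes and the message length, two messages carrying the same attachment and having the same length would map to the same token under these assumptions.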

2.-5. (Canceled) 

6. (Currently Amended) A method comprising: 

receiving, at a computing device, a first email message comprising a text body, an 
SMTP a simple mail transfer protocol email address, an attachment, and a domain name 
corresponding to the SMTP simple mail transfer protocol email address, the text body including 
displaying characters and non-displaying characters; 

searching for the non-displaying characters in the first email message;

removing the searched non-displaying characters, including non-displaying comments 
and non-displaying control characters; 

determining non-alphabetic displaying characters in the first email message, where 
determining the non-alphabetic displaying characters includes a per-character analysis that 
recursively determines for each character whether: 

a character is a non-alphabetic character; 

if the character is a non-alphabetic character, whether the character is a space; 
if the character is a space, determine whether the space is adjacent to a solitary
"i" or "a";

in response to a determination that the space is not adjacent to a solitary "i" or 
"a", deleting the non-alphabetic character; and 

if the non-alphabetic character is not a space, filtering the determined non- 
alphabetic displaying characters from the first email message; 

tokenizing the SMTP simple mail transfer protocol email address to generate an address 
token representative of the displaying characters of the SMTP simple mail transfer protocol 
email address; 

tokenizing the attachment to generate an attachment token that is representative of the 
attachment; 

tokenizing the domain name to generate a domain token representative of the domain
name;

determining a corresponding spam probability value from the address token, the 
attachment token, and the domain token; 

determining whether at least one of the address token, the attachment token, and the 
domain token is present in a database of tokens and, in response to a determination that at 



Serial No.: 10/685,656 
Art Unit: 2442 
Page 6 

least one of the address token, the attachment token, and the domain token is present in the 
database of tokens: 

updating the spam probability value of at least one of the address token, the 
attachment token, and the domain token; 

sorting the address token, the attachment token, and the domain token in 
accordance with the corresponding spam probability value to determine a predefined number of 
interesting tokens, the predefined number of interesting tokens being a subset of the address 
token, the attachment token, and the domain token; and 

filtering a second email message. 
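
By way of illustration only, one possible reading of the per-character analysis recited above might be sketched as follows. The sketch assumes that a "solitary" letter is a one-letter word "i" or "a" (case-insensitive) delimited by spaces, and it uses an iterative pass rather than a recursive one. The names `SOLITARY` and `filter_non_alphabetic` are hypothetical.

```python
SOLITARY = {"i", "a"}


def filter_non_alphabetic(text: str) -> str:
    """Filter non-alphabetic characters, keeping only spaces next to a solitary 'i' or 'a'."""
    # Non-alphabetic characters other than spaces are filtered outright.
    kept = "".join(ch for ch in text if ch.isalpha() or ch == " ")
    # Split on spaces; empty strings come from runs of spaces and are ignored.
    words = [w for w in kept.split(" ") if w]
    out = []
    for index, word in enumerate(words):
        if index > 0:
            previous = words[index - 1]
            # Keep the space only when it is adjacent to a solitary "i" or "a".
            if previous.lower() in SOLITARY or word.lower() in SOLITARY:
                out.append(" ")
        out.append(word)
    return "".join(out)
```

Under these assumptions, "v.i.a.g.r.a" becomes "viagra", while the spaces in "I have a d-o-g" that border the solitary "I" and "a" are preserved.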

7.-10. (Canceled) 

11. (Currently Amended) The method of claim 6, wherein determining the spam
probability comprises: 

assigning an address spam probability value to the address token representative of the 
SMTP simple mail transfer protocol email address; 

assigning a domain spam probability value to the domain token representative of the 
domain name; and 

generating a Bayesian probability value using the address spam probability and the 
domain spam probability assigned to the address token and the domain token. 

12. (Previously Presented) The method of claim 11, wherein determining the spam 
probability further comprises: 

comparing the Bayesian probability value with a predefined threshold value. 

13. (Previously Presented) The method of claim 12, wherein determining the spam
probability further comprises: 

categorizing the first email message as spam in response to the Bayesian probability 
value being greater than the predefined threshold. 

14. (Previously Presented) The method of claim 12, wherein determining the spam 
probability further comprises: 

categorizing the first email message as non-spam in response to the Bayesian 
probability value being not greater than the predefined threshold. 
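
By way of illustration only, the generating, comparing, and categorizing steps of claims 11 through 14 might be sketched as follows. The claims do not recite a particular formula; the sketch assumes a naive Bayesian combination that treats the token probabilities as independent, and the threshold value of 0.9 is likewise an assumption. The function names are hypothetical.

```python
from math import prod
from typing import Sequence


def bayesian_probability(token_probs: Sequence[float]) -> float:
    """Combine per-token spam probabilities, assuming independence (illustrative only)."""
    spam = prod(token_probs)                    # evidence that the message is spam
    ham = prod(1.0 - p for p in token_probs)    # evidence that the message is not spam
    if spam + ham == 0.0:
        return 0.5                              # contradictory certainties: treat as neutral
    return spam / (spam + ham)


def categorize(token_probs: Sequence[float], threshold: float = 0.9) -> str:
    """Return 'spam' if the combined value is greater than the threshold, else 'non-spam'."""
    return "spam" if bayesian_probability(token_probs) > threshold else "non-spam"
```

For example, an address token with probability 0.99 and a domain token with probability 0.95 would combine to a value above 0.9 and be categorized as spam under these assumptions.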

15. (Canceled) 

16. (Previously Presented) The method of claim 6, wherein receiving the first email
message further comprises: 

receiving the first email message including a text body. 

17. (Previously Presented) The method of claim 16, further comprising: 
tokenizing the words in the text body to generate body tokens representative of the
words in the text body.

18. (Canceled) 

19. (Previously Presented) The method of claim 17, wherein determining the spam 
probability comprises: 

assigning a body spam probability value to each of the body tokens representative of the 
words in the text body;

assigning an attachment spam probability value to the attachment token representative 
of the attachment; and 

generating a Bayesian probability value using the body spam probability value and the 
attachment spam probability value assigned to the body tokens and the attachment token. 

20. (Previously Presented) The method of claim 19, wherein determining the spam 
probability further comprises: 

comparing the Bayesian probability value with a predefined threshold value. 

21. (Previously Presented) The method of claim 20, wherein determining the spam
probability further comprises: 

categorizing the first email message as spam in response to the Bayesian probability 
value being greater than the predefined threshold. 

22. (Previously Presented) The method of claim 20, wherein determining the spam 
probability further comprises: 

categorizing the first email message as non-spam in response to the Bayesian 
probability value being not greater than the predefined threshold. 

23. (Currently Amended) A system comprising: 

a memory component that stores at least the following: 

email receive logic configured to receive a first email message comprising an 
SMTP a simple mail transfer protocol email address, a domain name corresponding to the 
SMTP simple mail transfer protocol email address, and an attachment, the first email message 
further including displaying characters and non-displaying characters;

searching logic configured to search for the non-displaying characters in the first 
email message; 

removing logic configured to remove the non-displaying characters, including 
non-displaying comments and non-displaying control characters; 

first determine logic configured to determine non-alphabetic displaying characters 
in the first email message, where determining the non-alphabetic displaying characters includes 
a per-character analysis that recursively determines for each character whether: 
a character is a non-alphabetic character; 

if the character is a non-alphabetic character, whether the character is a
space;

if the character is a space, determine whether the space is adjacent to a
solitary "i" or "a";

in response to a determination that the space is not adjacent to a solitary 
"i" or "a", deleting the non-alphabetic character; and 

if the non-alphabetic character is not a space, filtering the determined 
non-alphabetic displaying characters from the first email message; 

tokenize logic configured to tokenize the SMTP simple mail transfer protocol 
email address to generate an address token representative of the SMTP simple mail transfer 
protocol email address; 

tokenize logic configured to tokenize the attachment to generate an attachment 
token that is representative of the attachment; 

tokenize logic configured to tokenize the domain name to generate a domain 
token representative of the domain name; 

analysis logic configured to determine a corresponding spam probability value 
from the address token, the attachment token, and the domain token; and

second determine logic configured to determine whether at least one of the 
address token, the attachment token, and the domain token is present in a database of tokens 
and, in response to a determination that at least one of the address token, the attachment 
token, and the domain token is present in the database of tokens: 

update the corresponding spam probability value of the address token, 
the attachment token, and the domain token; 

sort the address token, the attachment token, and the domain token in 
accordance with the corresponding spam probability value to determine a predefined number of 
interesting tokens, the predefined number of interesting tokens being a subset of the address 
token, the attachment token, and the domain token, wherein only displaying characters are 
tokenized; and 

filter a second email message. 
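
By way of illustration only, the sorting step that determines a predefined number of interesting tokens might be sketched as follows. The sketch assumes that "interesting" means farthest from a neutral probability of 0.5, consistent with the "greatest non-neutral probability values" language of claim 1; the neutral value and the function name `select_interesting` are assumptions, and the token strings in the usage note are hypothetical.

```python
from typing import Dict, List


def select_interesting(token_probs: Dict[str, float], count: int,
                       neutral: float = 0.5) -> List[str]:
    """Return the `count` tokens whose spam probability is farthest from neutral."""
    ranked = sorted(
        token_probs,
        key=lambda token: abs(token_probs[token] - neutral),
        reverse=True,  # most non-neutral tokens first
    )
    return ranked[:count]
```

Given an address token at 0.99, a domain token at 0.55, and an attachment token at 0.02, a count of two would select the address token and the attachment token, since both lie farther from 0.5 than the domain token.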

24. (Canceled) 

25. (Currently Amended) A non-transitory computer-readable storage medium that 
includes a program that, when executed by a computer, performs at least the following: 

receive a first email message comprising an SMTP a simple mail transfer protocol email 
address, a domain name corresponding to the SMTP simple mail transfer protocol email 
address, and an attachment, the first email message further including displaying characters and 
non-displaying characters; 

search for non-displaying characters in the first email message; 

remove the non-displaying characters, including non-displaying comments and non- 
displaying control characters; 

determine non-alphabetic displaying characters in the first email message, where
determining the non-alphabetic displaying characters includes a per-character analysis that
recursively determines for each character whether:

a character is a non-alphabetic character;

if the character is a non-alphabetic character, whether the character is a space;
if the character is a space, determine whether the space is adjacent to a solitary
"i" or "a";

in response to a determination that the space is not adjacent to a solitary "i" or
"a", deleting the non-alphabetic character; and

if the non-alphabetic character is not a space, filtering the determined
non-alphabetic displaying characters from the first email message;

tokenize the SMTP simple mail transfer protocol email address to generate an address 
token representative of the SMTP simple mail transfer protocol email address; 

tokenize the attachment to generate an attachment token that is representative of the 
attachment; 

tokenize the domain name to generate a domain token representative of the domain
name;

determine a corresponding spam probability value from the address token, the 
attachment token, and the domain token; and 

determine whether at least one of the address token, the attachment token, and the 
domain token is present in a database of tokens and, in response to a determination that at 
least one of the address token, the attachment token, and the domain token is present in the 
database of tokens: 

update the corresponding spam probability value of the address token, the 
attachment token, and the domain token; 

sort the address token, the attachment token, and the domain token in
accordance with the corresponding spam probability value to determine a predefined number of 
interesting tokens, the predefined number of interesting tokens being a subset of the generated 
tokens, wherein only displaying characters are tokenized; and 

filter a second email message. 

26. (Currently Amended) The non-transitory computer-readable storage medium of claim
25, the program further causing the computer to perform at least the following:

assign an address spam probability value to the address token representative of the 
SMTP simple mail transfer protocol email address; 

assign a domain spam probability value to the domain token representative of the 
domain name; and 

generate a Bayesian probability value using the address spam probability value and the 
domain spam probability value assigned to the tokens. 

27. (Currently Amended) The non-transitory computer-readable storage medium of claim
26, the program further causing the computer to perform at least the following:

compare the Bayesian probability value with a predefined threshold value. 

28. (Currently Amended) The non-transitory computer-readable storage medium of claim
27, the program further causing the computer to perform at least the following:

categorize the first email message as spam in response to the Bayesian probability 
value being greater than the predefined threshold. 

29. (Currently Amended) The non-transitory computer-readable storage medium of claim
27, the program further causing the computer to perform at least the following: 

categorize the first email message as non-spam in response to the Bayesian probability 
value being not greater than the predefined threshold. 



30. (Currently Amended) A system comprising: 

a memory component that stores at least the following: 

email receive logic configured to receive a first email message comprising an 
attachment and an address, the email message further including displaying characters and non- 
displaying characters; 

search logic configured to search for the non-displaying characters in the first 
email message; 

remove logic configured to remove the non-displaying characters, including non- 
displaying comments and non-displaying control characters; 

determine logic configured to determine non-alphabetic displaying characters in 
the first email message, where determining the non-alphabetic displaying characters includes a 
per-character analysis that recursively determines for each character whether: 
a character is a non-alphabetic character; 

if the character is a non-alphabetic character, whether the character is a
space;

if the character is a space, determine whether the space is adjacent to a
solitary "i" or "a";

in response to a determination that the space is not adjacent to a solitary 
"i" or "a", deleting the non-alphabetic character; and 

if the non-alphabetic character is not a space, filtering the determined 
non-alphabetic displaying characters from the first email message;

tokenize logic configured to generate at least one attachment token 
representative of the attachment; 

analysis logic configured to determine a corresponding spam probability value 
from the at least one attachment token; and 

database determining logic configured to determine whether the at least one 
attachment token is present in a database of tokens and, in response to a determination that the 
at least one attachment token is present in the database of tokens: 

update the corresponding spam probability value of the at least one
attachment token;

sort the at least one attachment token in accordance with the 
corresponding spam probability value to determine a predefined number of interesting tokens, 
the predefined number of interesting tokens being a subset of the at least one attachment 
token, wherein only displaying characters are tokenized; and 

filter a second email message. 

31. (Canceled) 

32. (Currently Amended) A non-transitory computer-readable storage medium that 
includes a program that, when executed by a computer, performs at least the following: 

receive a first email message comprising an attachment and an address, the first email 
message further including displaying characters and non-displaying characters; 

search for the non-displaying characters in the first email message; 

remove the non-displaying characters, including non-displaying comments and non- 
displaying control characters; 

determine non-alphabetic displaying characters in the first email message, where
determining the non-alphabetic displaying characters includes a per-character analysis that
recursively determines for each character whether:

a character is a non-alphabetic character;

if the character is a non-alphabetic character, whether the character is a space;
if the character is a space, determine whether the space is adjacent to a solitary
"i" or "a";

in response to a determination that the space is not adjacent to a solitary "i" or
"a", deleting the non-alphabetic character; and

if the non-alphabetic character is not a space, filtering the determined
non-alphabetic displaying characters from the first email message;

generate at least one attachment token representative of the attachment; 

determine a spam probability value from the at least one attachment token; and 

determine whether the at least one attachment token is present in a database of tokens 
and, in response to a determination that the at least one attachment token is present in the 
database of tokens: 

update the spam probability value of the at least one attachment token; 

sort the at least one attachment token in accordance with the spam probability value to 
determine a predefined number of interesting tokens, the predefined number of interesting 
tokens being a subset of the generated tokens, wherein only displaying characters are 
tokenized; and 

filter a second email message. 

33. (Currently Amended) The non-transitory computer-readable storage medium of claim
32, the program further causing the computer to perform at least the following:

receive the first email message having a text body. 

34. (Currently Amended) The non-transitory computer-readable storage medium of claim
33, the program further causing the computer to perform at least the following:

tokenize words in the text body to generate body tokens representative of the words in 
the text body. 

35. (Currently Amended) The non-transitory computer-readable storage medium of claim
34,

assign a body spam probability value to each of the body tokens representative of the 
words in the text body; 

assign an attachment spam probability value to the token representative of the 
attachment; and 

generate a Bayesian probability value using the attachment spam probability and the
body spam probability assigned to the body tokens and the attachment token.

36. (Currently Amended) The non-transitory computer-readable storage medium of claim 
35, the program further causing the computer to perform at least the following: 

compare the Bayesian probability value with a predefined threshold value. 

37. (Currently Amended) The non-transitory computer-readable storage medium of claim
36, the program further causing the computer to perform at least the following: 

categorize the first email message as spam in response to the Bayesian probability 
value being greater than the predefined threshold. 

38. (Currently Amended) The non-transitory computer-readable storage medium of claim 
36, the program further causing the computer to perform at least the following: 

categorize the first email message as non-spam in response to the Bayesian probability 
value being not greater than the predefined threshold. 

39. (Previously Presented) The method of claim 1, wherein the first email message is 
received at a computing device. 



40. (Canceled) 

REMARKS

Assignee respectfully requests entry of the following amendments and remarks in 
response to the Non-Final Office Action mailed July 20, 2010. Assignee respectfully submits 
that the amendments and remarks contained herein place the instant application in condition for 
allowance. 

Upon entry of the amendments in this response, claims 1, 6, 11-14, 16, 17, 19-23, 25- 
30, and 32-39 are pending. In particular, Assignee amends claims 1, 6, 11, 23, 25-30, and 32-
38 and cancels claims 24, 31, and 40. Reconsideration and allowance of the application and 
presently pending claims are respectfully requested. 

I. Allowable Subject Matter 

Assignee acknowledges the Examiner's indication on page 2 of the Office Action that 
claim 40 would be allowable if rewritten to include all of the limitations of the base claim and any 
intervening claims. In that it is believed that every rejection and objection has been overcome, it 
is respectfully submitted that each of the claims that remains in the case is presently in 
condition for allowance. 

II. Response to Rejection of Claims under 35 U.S.C. §101 

The Office Action rejects claims 25-29 and 32-38 under 35 U.S.C. § 101 and requests 
that the claims be rewritten to recite that the medium is non-transitory. Accordingly, the claims 
have been rewritten as requested. Withdrawal of the rejection is respectfully requested. 

III. Response to Rejection of Claims under 35 U.S.C. §112

The Office Action rejects claims 24 and 31 under 35 U.S.C. § 112, first paragraph, as
allegedly failing to comply with the written description requirement. Claims 24 and 31 are
canceled without prejudice, waiver, or disclaimer, and therefore, the rejection of the claims is
rendered moot. Assignee takes this action merely to reduce the number of disputed issues and to 
facilitate early allowance and issuance of other claims in the present application. Assignee 
reserves the right to pursue the subject matter of the canceled claims in a continuing application, if 
Assignee so chooses, and does not intend to dedicate any of the canceled subject matter to the 
public. 

IV. Response to Rejection of Claims under 35 U.S.C. §103 

The Office Action indicates that claims 1 and 39 stand rejected under 35 U.S.C. §103(a)
as allegedly being unpatentable over U.S. Patent Publication Number 2004/0093384 ("Shipp")
in view of U.S. Patent Publication Number 2004/0073617 ("Milliken") further in view of U.S.
Patent Number 7,320,020 B2 ("Chadwick") further in view of "A Bayesian Approach to Filtering
Junk E-Mail" ("Sahami") further in view of "Identifying Junk Electronic Mail In Microsoft Outlook
with a Support Vector Machine" ("Woitaszek") further in view of U.S. Patent Number 6,968,571
("Devine") further in view of U.S. Patent Publication Number 2004/0107189 ("Burdick") further in
view of "An Information Retrieval System Based on Superimposed Coding" ("Files") further in
view of U.S. Patent Publication Number 2004/0064537 ("Anderson") further in view of
"Uuencode and MIME FAQ". Claims 6, 16, 17, 23, 24, and 25 stand rejected under 35 U.S.C.
§103(a) as allegedly being unpatentable over Shipp in view of Milliken further in view of
Chadwick further in view of Woitaszek. Claims 11-14, 19-22, and 26-29 stand rejected under 35
U.S.C. §103(a) as allegedly being unpatentable over Shipp in view of Milliken further in view of
Chadwick further in view of Woitaszek further in view of Sahami. Claims 30-34 stand rejected
under 35 U.S.C. §103(a) as allegedly being unpatentable over Milliken in view of Chadwick
further in view of Woitaszek. Claims 35-38 stand rejected under 35 U.S.C. §103(a) as allegedly
being unpatentable over Milliken in view of Chadwick further in view of Woitaszek further in view
of Sahami.

Assignee traverses the rejections for at least the following reasons. More specifically, 
claim 1 has been rewritten to include the allowable subject matter of claim 40. Therefore, the 
rejection of independent claim 1 should be withdrawn. 

Similarly, independent claims 6, 23, 25, 30, and 32 have been rewritten to include similar 
subject matter as allowable claim 40. Therefore, independent claims 6, 23, 25, 30, and 32 are 
believed to be allowable over the cited art. Accordingly, dependent claims 11-14, 16, 17, 19-22, 
26-29, and 33-39 are also believed to be allowable over the cited art. 

Claim 40 is canceled without prejudice, waiver, or disclaimer. Assignee takes this action 
merely to reduce the number of disputed issues and to facilitate early allowance and issuance of 
other claims in the present application. 

CONCLUSION

For at least the reasons set forth above, all objections and/or rejections have been
traversed, rendered moot, and/or addressed, and the now pending claims are in condition
for allowance. Favorable reconsideration and allowance of the present application and all 
pending claims are hereby courteously requested. 

Any other statements in the Office Action that are not explicitly addressed herein are not 
intended to be admitted. In addition, any and all findings of inherency are traversed as not 
having been shown to be necessarily present. Furthermore, any and all findings of well-known 
art and Official Notice, or statements interpreted similarly, should not be considered well-known 
for the particular and specific reasons that the claimed combinations are too complex to support 
such conclusions and because the Office Action does not include specific findings predicated on 
sound technical and scientific reasoning to support such conclusions. 

If, in the opinion of the Examiner, a telephonic conference would expedite the examination 
of this matter, the Examiner is invited to call the undersigned attorney at (770) 933-9500. 

Respectfully submitted, 

/Charles W. Griggers/

Charles W. Griggers 
Reg. No. 47,283 

AT&T Legal Department - TKHR 

Attn: Patent Docketing 
One AT&T Way 
Room 2A-207 
Bedminster, NJ 07921 
Customer No.: 38823 



